Clustering and dendrogram

Sex appears to be the determining factor in the hierarchical clustering. Neither disease status nor ethnicity clusters in any meaningful way. In addition, one sample appears to have a mismatched ‘sex’ label.
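One way to check a suspected label mismatch is to cut the tree into two clusters and compare each sample's recorded label with the majority label of its cluster. The sketch below is only an illustration on synthetic data: the matrix `X`, the labels, the Ward linkage and the Euclidean distance are all assumptions, not the settings used for the dendrogram in this report.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)

# Hypothetical stand-in for the real data: 40 samples x 50 features,
# with a strong sex-linked shift in the first 10 features.
n_samples, n_features = 40, 50
true_sex = np.array(["F"] * 20 + ["M"] * 20)
X = rng.normal(size=(n_samples, n_features))
X[true_sex == "M", :10] += 6.0

# Recorded labels, with one sample deliberately mislabelled (index 5).
recorded_sex = true_sex.copy()
recorded_sex[5] = "M"

# Agglomerative clustering (assumed: Ward linkage, Euclidean distance),
# then cut the tree into two flat clusters.
Z = linkage(X, method="ward")
clusters = fcluster(Z, t=2, criterion="maxclust")

# Within each cluster, find the majority recorded label and flag
# every sample whose recorded label disagrees with it.
suspects = []
for c in np.unique(clusters):
    members = np.flatnonzero(clusters == c)
    labels, counts = np.unique(recorded_sex[members], return_counts=True)
    majority = labels[np.argmax(counts)]
    suspects.extend(int(i) for i in members[recorded_sex[members] != majority])

print("Samples whose label disagrees with their cluster:", suspects)
```

A flagged sample is only a candidate for a labelling error; it would still need to be confirmed against the original sample records.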

Heatmap

Heatmap using the same clustering as the dendrogram. The highly distant groups of rows are separated by sex.

PCA

Screeplot

It would take 119 principal components to capture 90% of the variance in the data.
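A count like this can be read off the cumulative explained-variance curve. The following is a minimal sketch, assuming PCA on a mean-centered samples-by-features matrix; the helper name `n_components_for_variance` and the random `X_demo` matrix are hypothetical, so the number it prints is unrelated to the 119 reported here.

```python
import numpy as np

def n_components_for_variance(X, threshold=0.90):
    """Smallest number of PCs whose cumulative explained variance
    reaches `threshold` (assumes 0 < threshold < 1)."""
    Xc = X - X.mean(axis=0)                   # center each feature
    s = np.linalg.svd(Xc, compute_uv=False)   # singular values, descending
    explained = s**2 / np.sum(s**2)           # variance ratio per PC
    cumulative = np.cumsum(explained)
    k = int(np.argmax(cumulative >= threshold)) + 1  # first PC crossing it
    return k, cumulative

# Hypothetical stand-in for the real data matrix (samples x features).
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(200, 150))

k, cumulative = n_components_for_variance(X_demo, threshold=0.90)
print(f"{k} PCs reach {cumulative[k - 1]:.1%} of the variance")
```

When the count is large relative to the number of samples, the variance is spread thinly across many components rather than concentrated in the first few.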

Pair plot

Color indicates sex and shape indicates disease status. The sample with the mismatched ‘sex’ label is visible here too.

PC Heatmap

Heatmap of the principal component scores for the first 7 PCs. It doesn’t reveal much.

PC relation with metadata

Sex appears to be the only relevant characteristic separated by the first 5 PCs.
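The visual impression from the plots below could be backed by a simple per-PC test. The sketch here is one possible approach, not the method used in this report: it assumes a Kruskal-Wallis test for categorical variables (sex, disease status, ethnicity) and a Spearman correlation for numeric ones (age). The function name `pc_metadata_pvalues` and the synthetic scores and metadata are hypothetical.

```python
import numpy as np
from scipy.stats import kruskal, spearmanr

def pc_metadata_pvalues(scores, categorical, numeric):
    """Return {variable name: array of p-values, one per PC column}."""
    n_pcs = scores.shape[1]
    out = {}
    # Categorical variables: do PC scores differ between groups?
    for name, labels in categorical.items():
        labels = np.asarray(labels)
        groups = np.unique(labels)
        out[name] = np.array([
            kruskal(*[scores[labels == g, j] for g in groups]).pvalue
            for j in range(n_pcs)
        ])
    # Numeric variables: is there a monotonic trend along the PC?
    for name, values in numeric.items():
        out[name] = np.array([
            spearmanr(scores[:, j], values).pvalue for j in range(n_pcs)
        ])
    return out

# Synthetic example: 60 samples, 5 PCs, with sex driving PC1 only.
rng = np.random.default_rng(2)
sex = np.array(["F"] * 30 + ["M"] * 30)
status = rng.choice(["case", "control"], size=60)
age = rng.integers(20, 80, size=60)
scores = rng.normal(size=(60, 5))
scores[sex == "M", 0] += 5.0

pvals = pc_metadata_pvalues(
    scores,
    categorical={"sex": sex, "disease_status": status},
    numeric={"age": age},
)
for name, p in pvals.items():
    print(name, np.round(p, 4))
```

With several variables tested against several PCs, the raw p-values would need a multiple-testing correction before any association is called significant.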

Sex

Disease Status

Age

The lighter the point, the higher the age.

Ethnicity

It seems that only sex is associated with any of the first 5 PCs.